Using Cohesion and Coherence Models for Text Summarization
Abstract
In this paper we investigate two classes of techniques to determine what is salient in a text, as a means of deciding whether that information should be included in a summary. We introduce three methods based on text cohesion, which models text in terms of relations between words or referring expressions, to help determine how tightly connected the text is. We also describe a method based on text coherence, which models text in terms of macro-level relations between clauses or sentences to help determine the overall argumentative structure of the text. The paper compares salience scores produced by the cohesion and coherence methods and compares them with human judgments. The results show that while the coherence method beats the cohesion methods in accuracy of determining clause salience, the best cohesion method can reach 76% of the accuracy levels of the coherence method in determining salience. Further, two of the cohesion methods each yield significant positive correlations with the human salience judgments. We also compare the types of discourse-related text structure discovered by cohesion and coherence methods.
Related resources
An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation has become a vital task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods, such as entity-based and graph-based models, track how nouns and noun phrases change role across sequential sentences within a short part of a text. They even have limitations in global coheren...
Cohesion and coherence for Automatic Summarization
This paper presents the integration of cohesive properties of text with coherence relations, to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only slight improvement in the resulting s...
Integrating cohesion and coherence for Automatic Summarization
This paper presents the integration of cohesive properties of text with coherence relations, to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only slight improvement in the resulting s...
EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS
Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and word-net, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...
Generating Indicative-Informative Summaries with SumUM
s are texts used in tasks such as assessing the content of the document and deciding if the source is worth reading. If text summarization systems are designed to fulfil those requirements, the quality of the generated texts has to be evaluated according to their intended function. The quality of human-produced abstracts has been examined in the literature (Grant, 1992; Kaplan et al., 1994; Gib...
Publication date: 1998